Computational Models for Speech Production

Author

Li Deng

Abstract

Major speech production models from speech science literature and a number of popular statistical “generative” models of speech used in speech technology are surveyed. Strengths and weaknesses of these two styles of speech models are analyzed, pointing to the need to integrate the respective strengths while eliminating the respective weaknesses. As an example, a statistical task-dynamic model of speech production is described, motivated by the original deterministic version of the model and targeted for integrated-multilingual speech recognition applications. Methods for model parameter learning (training) and for likelihood computation (recognition) are described based on statistical optimization principles integrated in neural network and dynamic system theories.

Related articles

C2H: A Computational Model of H&H-based Phonetic Contrast in Synthetic Speech

This paper presents a computational model of human speech production based on the hypothesis that low energy attractors for a human speech production system can be identified, and that interpolation/extrapolation along the key dimension of hypo/hyper-articulation can be motivated by energetic considerations of phonetic contrast. An HMM-based speech synthesiser along with continuous adaptation o...

Establishing some principles of human speech production through two-dimensional computational models

Human speech production is often described as an optimisation process, which tends to maximise the effectiveness of the communication process while minimising the effort involved in the production. The aim of this paper is to investigate this highly complex problem with two dimensionally reduced spaces corresponding to different computational models. Since the high-dimensional parameter space which us...

Computational Models for Auditory Speech Processing

Auditory processing of speech is an important stage in the closed-loop human speech communication system. A computational auditory model for temporal processing of speech is described with details of numerical solution and of the temporal information extraction method given. The model is used to process fluent speech utterances and is applied to phonetic classification using both clean and nois...

Articulatory Phonology, Task Dynamics and Computational Adequacy

This paper discusses articulatory phonology and task dynamics as potentially computationally adequate models which, together, might characterise speech production. The idea is introduced that, particularly at the task dynamic level, the object oriented computational paradigm is appropriate — this is a novel approach in speech production modelling. The paper concludes that articulatory phonology...

Coexpressivity of speech and gesture: Lessons for models for aligned speech and gesture production

When people combine language and gesture to convey their intended information, both modalities are characterized by an intriguing degree of coherence and consistency. For developing an account of how speech and gesture are aligned to each other, one question of major importance is how meaning is distributed across the two channels. In this paper, we start from recent empirical findings indicating ...

Publication date: 2001
